Setting Up Document Search

Keyoti

Configuring Keyoti Search

Triaster's preferred search utility is Keyoti Search.

Keyoti has the following advantages:

  • Offers an unlimited number of configurable tabs, each of which can be pointed at different document sources.
  • Allows searching of SharePoint document repositories (provided a UNC path is available).
  • Offers 'Did you mean?' functionality and Auto-Complete to give a richer, more Google-like experience.
  • Can index and search many gigabytes of data with very little performance degradation.
  • Provides a simple and easy-to-use mobile interface.

This article explains how to configure search locations, set up a mobile search, create tabs and configure indexing.

Enabling Keyoti

Keyoti can be enabled by modifying Settings.xml to include a /Settings/Search/SearchType element as follows:

<?xml version="1.0" encoding="utf-8" standalone="yes"?>

<Settings>

...

<Search>

<!-- Keyoti or Legacy -->

<SearchType>Keyoti</SearchType>

</Search>

...

</Settings>

This will enable Keyoti for all libraries and sites on the server.

Document Locations

The locations where Keyoti will look for documents to index are configured in the indexableSourceRecord.xml file. Each indexable source is configured as a local path/URL combination. Any documents found in the local path must be available in the same relative path through the URL to be included in the index.
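The local path/URL pairing can be illustrated with a minimal Python sketch. It is not part of Keyoti or Triaster Server: the function name document_url and the sample document are hypothetical, and the percent-encoding of spaces is an assumption about how a browser would request the file.

```python
from pathlib import PureWindowsPath
from urllib.parse import quote


def document_url(location: str, query: str, document_path: str) -> str:
    """Return the URL at which a document under `location` is expected.

    `location` is the local folder and `query` the matching URL of an
    indexable source; `query` is assumed to end with a trailing "/".
    Raises ValueError if the document is not underneath `location`.
    """
    # Path of the document relative to the indexable source's local folder.
    relative = PureWindowsPath(document_path).relative_to(PureWindowsPath(location))
    # The same relative path under the URL, using forward slashes and
    # percent-encoded segments (the encoding is an assumption of this sketch).
    return query + "/".join(quote(part) for part in relative.parts)


if __name__ == "__main__":
    print(document_url(
        "C:\\Triaster\\Documents\\",
        "http://localhost/Documents/",
        "C:\\Triaster\\Documents\\Policies\\Expenses Policy.docx",
    ))
```

If a document cannot be opened at the URL computed this way, it would not satisfy the "same relative path" requirement described above.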

Each indexable source can be assigned multiple categories. Categories match up with the tab names defined in Settings.xml to assign results to tabs when searching.

Multiple indexable sources can also share the same category. So it is possible to have a combination of global, library and site level indexable sources all show their results under a single tab on the search tool.

Indexable sources are scoped using locations, which control both the sites they are shown for and the devices they display on. Each indexable source can contain multiple location scopes, with no restriction on scope/device combinations. So it is possible, for example, to have an indexable source that is global on desktop and also available on mobile for a single site.

Setting up a new global document store indexable source

  1. Ensure Keyoti is the configured search tool by following the steps outlined in the Enabling Keyoti section.
  2. If it doesn't already exist, create a Documents folder: for the purposes of this article it is assumed the Documents folder is C:\Triaster\Documents
  3. In IIS, add a new virtual directory to the Default Web Site. Give it an Alias of Documents and a Physical Path of C:\Triaster\Documents
  4. Open Windows Explorer and browse to C:\Triaster\TriasterServer2011\KeyotiSearch\IndexDirectory
  5. Open indexableSourceRecord.xml in an XML editor
  6. Take a copy of the last DataSource node from start tag to end tag and paste within the DataSources node after all other DataSource nodes
  7. In the new DataSource, set the ID field to the value of the LastUsedID node plus 1, double checking the new id does not clash with any other id.
  8. Update the LastUsedID node to the new id.
  9. Edit the location attribute of the new DataSource to point to C:\Triaster\Documents\. Note the trailing "\", it is very important!
  10. Edit the query attribute of the new DataSource to point to http://localhost/Documents/. Note the trailing "/", it is very important! You can replace localhost with your computer name.
  11. Ensure the ExtensionData attribute is not excluding any sub folders and is recursing sub folders.
    • ExtensionData is 4 values separated by @ symbols.
    • Ensure there is no value between the first @ symbol and the second. This is the NoRecurseSubFolderMatchList. Typically it is used when indexing Visio exported HTML to exclude the filename.pagename_files folder by including _files as a match to exclude. It can however specify multiple matches to exclude separated by a pipe | symbol.
    • Ensure the value between the second @ symbol and the third is set to True. This is the RecurseSubFolders option and is typically set to True to index documents in sub folders.
  12. In the t:Categories sub node of the new DataSource ensure there is only one t:Category node and set its value to Documents as below.

    <t:Categories>

    <t:Category>Documents</t:Category>

    </t:Categories>

  13. Edit the t:Locations sub node to ensure it's globally scoped and available on all devices.
  14. In the t:Locations sub node of the new DataSource ensure there is only one t:Location node and remove its value as below. The absence of a library and site means this indexable source is global. The absence of a :mobile or :desktop modifier means this is available on both device types.

    <t:Locations>

    <t:Location></t:Location>

    </t:Locations>

  15. Save the indexableSourceRecord.xml file.
  16. Open up the Triaster Sample Library - Live site.
  17. Log in to the Administration page and click on Refresh Document Search.
  18. As the new document store is global, refreshing any site will cause the new store to be re-indexed.
  19. After a couple of minutes, use the quick search option to search for MySearch. You should now receive results for the new document store under the Documents tab.

If you don't receive the expected results, consult the IndexLog.txt file in C:\Triaster\TriasterServer2011\KeyotiSearch.
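The ExtensionData checks described in the steps above can be sketched in Python as follows. This is an illustration only, not a Triaster tool: the function name check_extension_data is hypothetical, the first and fourth values are shown as the placeholders "first" and "last" because this article does not describe them, and treating True as case-insensitive is an assumption.

```python
def check_extension_data(extension_data: str) -> list:
    """Return a list of problems found; an empty list means the checks pass."""
    fields = extension_data.split("@")
    if len(fields) != 4:
        return ["expected 4 values separated by @, found %d" % len(fields)]

    problems = []
    # Second value (between the first and second @): the
    # NoRecurseSubFolderMatchList, a pipe-separated list of matches to exclude.
    excluded = [match for match in fields[1].split("|") if match]
    if excluded:
        problems.append("sub folders excluded: " + ", ".join(excluded))
    # Third value (between the second and third @): RecurseSubFolders.
    # Comparing case-insensitively is an assumption of this sketch.
    if fields[2].strip().lower() != "true":
        problems.append("RecurseSubFolders is not True")
    return problems


if __name__ == "__main__":
    # "first" and "last" are placeholders for the undocumented values.
    print(check_extension_data("first@@True@last"))
    print(check_extension_data("first@_files@False@last"))
```

An empty result corresponds to the state step 11 asks for: nothing excluded and sub folders recursed.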

Creating Tabs

The tabs (document categories) displayed in the Keyoti search tool, and the default (first) tab, are configured per site for desktop and mobile in the Settings.xml file.

<?xml version="1.0" encoding="utf-8" standalone="yes"?>

<Settings>

...

<PublicationSettings>

...

<Library Name="triaster sample library">

...

<Site Name="prelive">

...

<SearchCategories>Process Maps,Documents</SearchCategories>

<MobileSearchCategories>Process Maps,Documents</MobileSearchCategories>

...

</Site>

...

</Library>

...

</PublicationSettings>

...

</Settings>

The SearchCategories element contains a comma separated list of Categories/Tabs to display on the desktop Keyoti search tool. If omitted, it will default to Process Maps,Documents.

The MobileSearchCategories element contains a comma separated list of Categories/Tabs to display on the mobile Keyoti search tool. If omitted, it will default to whatever is set in the SearchCategories element; if that is also omitted, it will default to Process Maps,Documents.

The first tab specified is the default tab when searches are performed (see Setting the Default Search tab). Therefore, if you prefer documents to be searched first, the value should be set to Documents,Process Maps. Any number of categories/tabs can be added; the only restriction is screen real estate, especially on the mobile version. Once the tabs run to more than one line, they will not align as intended.

The tab names correspond with the categories defined in the indexableSourceRecord.xml file. Any indexable sources defined for the site (either global, library or site level) that have the Category defined will be displayed under that tab.
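The fallback rules for the two elements can be summarised in a short Python sketch. It is an illustration of the behaviour described in this section, not Triaster's implementation: the names parse_categories and resolve_tabs are hypothetical, and trimming whitespace and treating an empty element as omitted are assumptions.

```python
DEFAULT_CATEGORIES = ["Process Maps", "Documents"]


def parse_categories(value):
    """Split a comma separated list of tab names, ignoring empty entries."""
    return [name.strip() for name in value.split(",") if name.strip()]


def resolve_tabs(search_categories=None, mobile_search_categories=None):
    """Return the desktop and mobile tab lists; the first entry is the default tab."""
    # Desktop: SearchCategories, else the built-in default.
    desktop = parse_categories(search_categories or "") or list(DEFAULT_CATEGORIES)
    # Mobile: MobileSearchCategories, else whatever the desktop list resolved to.
    mobile = parse_categories(mobile_search_categories or "") or list(desktop)
    return {"desktop": desktop, "mobile": mobile}


if __name__ == "__main__":
    print(resolve_tabs())
    print(resolve_tabs("Documents,Process Maps"))
```

In the second call only SearchCategories is supplied, so the mobile list follows the desktop list and Documents becomes the default tab on both.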

Setting the Default Search tab

These instructions show how to set the Documents tab as the default tab. The same method can be used to set up any custom tab as the default.

  1. Open Windows Explorer and browse to C:\Triaster\TriasterServer2011\Settings
  2. Open Settings.xml in Notepad++
  3. Locate the /Settings/PublicationSettings/Library[@Name="triaster sample library"]/Site[@Name="live"] node.
  4. Within the Site node, add a new sub node called SearchCategories.
  5. Set its value to Documents,Process Maps, as below.

    ...

    <Site Name="live">

    <SearchCategories>Documents,Process Maps</SearchCategories>

    ...

    </Site>

    ...

  6. Open up the Triaster Sample Library - Live site.
  7. Use the quick search option to search for MySearch. You should now receive results on the Documents tab by default.

Indexing

Triaster have developed a command-line tool that can be used to (re-)index documents. By default this is installed to C:\Triaster\TriasterServer2011\KeyotiSearch\KeyotiReindex.exe.

The following command line arguments are supported:

  • /ID - Optionally used to specify the index directory. If omitted, the directory will be identified by looking at the install folder in the registry and appending \KeyotiSearch\IndexDirectory.
  • /L - Optionally used to specify a library. When specified, index-able sources that apply to that library (site level, library level or global) will be (re-)indexed.
  • /S - Optionally used to specify a site. When specified, index-able sources that apply to that site (site level, library level or global) will be (re-)indexed. When specified, /L is required.
  • /C - Optionally used to specify a comma separated list of categories. When specified, only those index-able sources with matching categories will be (re-)indexed.
  • /EC - Optionally used to specify a comma separated list of categories to exclude. When specified, any index-able sources with matching categories will NOT be (re-)indexed. Cannot be used in conjunction with /C.
  • /Q - Quick mode. When specified, orphaned documents (those whose index-able source is no longer configured in the indexableSourceRecord.xml file) will not be identified and removed, and the index will not be optimized. This is useful when running KeyotiReindex.exe several times in succession, as it avoids duplicating work: specify /Q on every call except the last.
  • Logging can be performed by redirecting the console application's output to a text file. This is done by appending > logfile.txt to create a new log file, or >> logfile.txt to append to an existing log file. Rather than being output to the command window, progress will be written to the specified log file.
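The rules for combining these arguments can be sketched as a small Python helper that assembles a command line. The helper name build_reindex_command is hypothetical and is not supplied with Triaster Server. The /flag:"value" form follows the /l:, /s: and /C: usages shown elsewhere in this article; applying the same form to /ID and /EC is an assumption.

```python
def build_reindex_command(library=None, site=None, categories=None,
                          exclude_categories=None, quick=False,
                          index_directory=None, log_file=None, append_log=False):
    """Assemble a KeyotiReindex.exe command line, enforcing the documented rules."""
    if site and not library:
        raise ValueError("/S requires /L to be specified")
    if categories and exclude_categories:
        raise ValueError("/EC cannot be used in conjunction with /C")

    parts = ["KeyotiReindex.exe"]
    if index_directory:
        parts.append('/ID:"%s"' % index_directory)  # syntax assumed
    if library:
        parts.append('/L:"%s"' % library)
    if site:
        parts.append('/S:"%s"' % site)
    if categories:
        parts.append('/C:"%s"' % ",".join(categories))
    if exclude_categories:
        parts.append('/EC:"%s"' % ",".join(exclude_categories))  # syntax assumed
    if quick:
        parts.append("/Q")
    if log_file:
        # ">" creates a new log file, ">>" appends to an existing one.
        parts.append((">> " if append_log else "> ") + log_file)
    return " ".join(parts)


if __name__ == "__main__":
    print(build_reindex_command(library="triaster sample library", site="live",
                                categories=["Documents"], quick=True,
                                log_file="logfile.txt", append_log=True))
```

The sketch only builds the text of the command; it does not run KeyotiReindex.exe.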

Re-index Documents on Publish

Documents can be reindexed automatically after a publishing event (although this is not enabled by default), on a specific schedule, or manually as required. Reindexing is also necessary after cloning or deleting a site/library.

Re-index maps on Publish

In Triaster Server, Keyoti is (re-)indexed on publish using the PostPublishReindex.cmd file, which by default is installed to C:\Triaster\TriasterServer2011\KeyotiSearch\PostPublishReindex.cmd. To enable this, the file must be modified to uncomment (remove REM from) the call to KeyotiReindex.exe, which is the last line shown below.

REM Set the current directory to the one containing this CMD file

cd /d %~dp0

REM Log file

set LogFile=IndexLog.txt

set Library=%~1

set Stage=%~2

set LibraryMapsFolder=%~3

set MapsFolder=%~4

REM KeyotiReindex.exe /l:"%Library%" /s:"%Stage%" > %LogFile%

This will enable (re-)indexing on publish, for any index-able source that relates to the published site (site level, library level or global) for all category types. If you wish to restrict it to just (re-)index published html for example, you would need to modify the call to KeyotiReindex.exe to include /C:"Process Maps".

On Demand

By default, in Triaster Server, Keyoti will (re-)index when the Refresh Document Search option in the Administration tool is used.

To customise what Keyoti does when the Refresh Document Search option is used, modify the DocumentReindex.cmd file. By default this is installed to C:\Triaster\TriasterServer2011\KeyotiSearch\DocumentReindex.cmd. The DocumentReindex.cmd file is similar in structure to the PostPublishReindex.cmd file above.

Scheduling

Because a (re-)index can lock the index directory, it is not possible to run multiple (re-)indexes at the same time. Therefore it is not advisable to schedule calls to the DocumentReindex.cmd file directly, as they may conflict with a scheduled publication's post publish (re-)index and cause one or the other not to happen due to the index directory being locked. The solution is to make use of the PublicationServer's queuing mechanism. When using the Refresh Document Search option in the Administration tool, a queue file is created in the PublicationServer's queue folder (default is C:\Triaster\TriasterServer2011\Queue). The format of the queue file is library_site.keyotitask, replacing library and site with the library and site to (re-)index. Scheduling the creation of this file will ensure it is queued alongside any publications that may be taking place at the same time.
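A scheduled task could create the queue file with a few lines of Python, sketched below under stated assumptions. The function names are hypothetical, and this article does not describe what the queue file should contain, so the sketch creates an empty file; check an existing .keyotitask file produced by Refresh Document Search before relying on this.

```python
import tempfile
from pathlib import Path


def queue_file_name(library, site):
    """Name of the queue file for a library/site: library_site.keyotitask."""
    return "%s_%s.keyotitask" % (library, site)


def queue_reindex(library, site, queue_folder):
    """Create the queue file in the PublicationServer's queue folder."""
    task_file = Path(queue_folder) / queue_file_name(library, site)
    # An empty file is an assumption; the required contents are not documented here.
    task_file.touch()
    return task_file


if __name__ == "__main__":
    # A temporary folder stands in for the real queue folder
    # (default C:\Triaster\TriasterServer2011\Queue).
    with tempfile.TemporaryDirectory() as folder:
        print(queue_reindex("triaster sample library", "live", folder).name)
```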

Cloning

When cloning a Library or Site, any index-able sources for Keyoti (in the indexableSourceRecord.xml file) will be cloned as appropriate, unless the DataSource already covers the new Library or Site (for example, because it is global, or because it is library level and a site is being cloned).

If the DataSource does not already cover the new Library or Site, then if the location of the DataSource is underneath the Library or Site being cloned, a new DataSource record is added. Otherwise, if the DataSource is not underneath the Library or Site being cloned, a new t:Location element is added to the existing DataSource record to enable it for the cloned Library or Site. A (re-)index will need to be performed following a publish for any of these changes to appear on the Keyoti search tool.
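The cloning rules above amount to a three-way decision, sketched here in Python. The function name clone_action and the D:\Example paths in the usage lines are hypothetical, and comparing folders by path prefix is an assumption about what "underneath" means.

```python
from pathlib import PureWindowsPath


def clone_action(datasource_location, cloned_folder, already_covers_clone):
    """Return what happens to a DataSource when a Library or Site is cloned."""
    if already_covers_clone:
        # For example a global DataSource, or a library level one on a site clone.
        return "no change"
    location = PureWindowsPath(datasource_location)
    folder = PureWindowsPath(cloned_folder)
    # "Underneath" is taken to mean the location is the folder or inside it.
    if location == folder or folder in location.parents:
        return "add new DataSource"
    return "add t:Location to existing DataSource"


if __name__ == "__main__":
    # Hypothetical folder layout, for illustration only.
    print(clone_action("D:\\Example\\LibraryA\\SiteA\\Docs\\",
                       "D:\\Example\\LibraryA\\SiteA", False))
    print(clone_action("C:\\Triaster\\Documents\\",
                       "D:\\Example\\LibraryA\\SiteA", False))
```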

Deleting

When deleting a Library or Site, any t:Location elements are removed as appropriate, and if the DataSource is exclusive to the Library or Site being deleted, the DataSource will be removed. A (re-)index of any other Library or Site will need to happen before the records are completely removed from the index, but as the Library or Site has been deleted, there should be no way to bring any orphaned documents up in a search unless someone modifies the address bar in a browser to specify the deleted Library or Site.